Audio-driven Nonlinear Video Diffusion
Authors
Abstract
In this paper we present a novel nonlinear video diffusion approach based on the fusion of information in the audio and video channels. Both modalities are combined into a diffusion coefficient that encodes the basic assumption in this domain, i.e. that related events in the audio and video channels occur at approximately the same time. The proposed diffusion coefficient thus depends on an estimate of the synchrony between sounds and video motion. As a result, information in video regions whose motion is not coherent with the soundtrack is reduced, and the sound sources are automatically highlighted. Several tests on challenging real-world sequences containing significant auditory and/or visual distractors demonstrate that our approach is able to emphasize the regions that are related to the soundtrack. In addition, to illustrate the capabilities of our method, we propose an application to the extraction of audio-related video regions by unsupervised segmentation. To the best of our knowledge, this is the first nonlinear video diffusion approach that integrates information from the audio modality.
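The abstract does not give the form of the diffusion coefficient, so the following is only a minimal sketch of the general idea, not the authors' method. It assumes that synchrony is estimated per pixel as the absolute Pearson correlation between frame-to-frame intensity change and frame-to-frame audio energy change, and it substitutes a Perona-Malik-style function so that diffusion is weak where synchrony is high. The names `synchrony_map`, `diffusion_coefficient`, `diffuse` and the parameters `k`, `steps`, `dt` are all hypothetical.

```python
import numpy as np

def synchrony_map(frames, audio_energy, eps=1e-8):
    """Per-pixel |Pearson correlation| between motion and audio change.

    frames: array (T, H, W); audio_energy: array (T,), one value per frame.
    This correlation measure is an assumption made for illustration.
    """
    motion = np.abs(np.diff(frames, axis=0))      # (T-1, H, W)
    audio = np.abs(np.diff(audio_energy))         # (T-1,)
    m = motion - motion.mean(axis=0)
    a = audio - audio.mean()
    cov = np.einsum("t,thw->hw", a, m)
    denom = np.sqrt((a ** 2).sum() * (m ** 2).sum(axis=0)) + eps
    return np.abs(cov / denom)

def diffusion_coefficient(sync, k=0.3):
    """Perona-Malik-style stand-in: close to 1 for low synchrony, small for high."""
    return 1.0 / (1.0 + (sync / k) ** 2)

def diffuse(frame, coeff, steps=20, dt=0.2):
    """Explicit spatial diffusion of one frame with a per-pixel coefficient.

    Uses a 4-neighbour stencil with reflecting borders; dt <= 0.25 keeps the
    explicit scheme stable when coeff <= 1.
    """
    u = np.asarray(frame, dtype=float).copy()
    c = np.pad(np.asarray(coeff, dtype=float), 1, mode="edge")
    c_center = c[1:-1, 1:-1]
    neighbours = (np.s_[:-2, 1:-1], np.s_[2:, 1:-1],
                  np.s_[1:-1, :-2], np.s_[1:-1, 2:])
    for _ in range(steps):
        p = np.pad(u, 1, mode="edge")
        flux = np.zeros_like(u)
        for sl in neighbours:
            # Average the coefficient across each pixel pair so that the
            # exchanged flux is antisymmetric and the frame mean is preserved.
            flux += 0.5 * (c_center + c[sl]) * (p[sl] - u)
        u += dt * flux
    return u
```

Under these assumptions, calling `diffuse(frames[t], diffusion_coefficient(synchrony_map(frames, audio_energy)))` smooths regions whose motion is uncorrelated with the audio more strongly than regions that move in step with it.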
Related resources
Speech-driven facial animation using a hierarchical model - Vision, Image and Signal Processing, IEE Proceedings-
A system capable of producing near video-realistic animation of a speaker given only speech inputs is presented. The audio input is a continuous speech signal, requires no phonetic labelling and is speaker-independent. The system requires only a short video training corpus of a subject speaking a list of viseme-targeted words in order to achieve convincing realistic facial synthesis. The system...
Short-term and Long-term Impact of Video-driven Metapragmatic Awareness Raising on Speech Act Production: A Case of Iranian Intermediate EFL Learners
Pragmatic comprehension of apology, request and refusal: An investigation on the effect of consciousness-raising video-driven prompts
Recent research in interlanguage pragmatics (ILP) has substantiated that some aspects of pragmatics are amenable to instruction in the second or foreign language classroom. However, there are still controversies over the most conducive teaching approaches and the required materials. Therefore, this study aims to investigate the relative effectiveness of conscio...
Denoising of Audio Data by Nonlinear Diffusion
Nonlinear diffusion has long proven its capability for discontinuity-preserving removal of noise in image processing. We investigate the possibility to employ diffusion ideas for the denoising of audio signals. An important difference between image and audio signals is which parts of the signal are considered as useful information and noise. While small-scale oscillations in visual images are n...
Audiovisual Attention Modeling and Salient Event Detection
Although human perception appears to be automatic and unconscious, complex sensory mechanisms exist that form the preattentive component of understanding and lead to awareness. Considerable research has been carried out into these preattentive mechanisms and computational models have been developed for similar problems in the fields of computer vision and speech analysis. The focus here is to e...
Publication date: 2011